feat: updated user ontology #939
📝 Walkthrough
Extended the User object schema.
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~5 minutes
🧹 Nitpick comments (4)
services/ontology/schemas/user.json (4)
60-63: Consider standardizing `inLanguage` to BCP 47 language tags.

Without a pattern constraint, this field may receive inconsistent values (e.g., "English", "en", "en-US"). Standardizing to BCP 47 tags improves interoperability:

♻️ Suggested: Add BCP 47 pattern for language codes

```diff
 "inLanguage": {
   "type": "string",
-  "description": "The language associated with the user profile"
+  "pattern": "^[a-z]{2,3}(-[A-Z]{2})?$",
+  "description": "The language associated with the user profile (BCP 47 tag, e.g., 'en', 'en-US')"
 },
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/ontology/schemas/user.json` around lines 60-63: the `inLanguage` string lacks validation and should be constrained to BCP 47 language tags. Update the "inLanguage" schema entry to include a "pattern" property that enforces a BCP 47 regex (IETF language tag), and optionally add "examples" and a clarifying "description" note. Target the "inLanguage" field in the JSON schema to ensure values like "en", "en-US", "zh-Hant" are valid while free-text values like "English" are rejected.
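To see what the suggested pattern actually admits, here is a small TypeScript check. It indicates that the pattern accepts `en` and `en-US` but rejects `zh-Hant`, which the prompt above lists as a value that should be valid. The broader `withScript` pattern is an illustrative assumption, not part of the review, and it still covers only a subset of BCP 47 (tags are formally case-insensitive, and variants and extensions are not handled).

```typescript
// Pattern copied from the review suggestion for "inLanguage".
const suggested = /^[a-z]{2,3}(-[A-Z]{2})?$/;

// Hypothetical broader pattern (an assumption, not from the review):
// language, optional 4-letter script subtag, optional 2-letter region subtag.
const withScript = /^[a-z]{2,3}(-[A-Z][a-z]{3})?(-[A-Z]{2})?$/;

const samples: string[] = ["en", "en-US", "zh-Hant", "zh-Hant-TW", "English"];

// Map each sample value to whether the pattern accepts it.
function check(pattern: RegExp, values: string[]): Record<string, boolean> {
  const result: Record<string, boolean> = {};
  for (const value of values) {
    result[value] = pattern.test(value);
  }
  return result;
}

const suggestedResult = check(suggested, samples);
const withScriptResult = check(withScript, samples);
console.log(suggestedResult);
console.log(withScriptResult);
```

If script subtags such as `Hant` need to be supported, the pattern in the schema would have to be widened along these lines before merging.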
37-59: Well-structured address object with appropriate constraints.

Good use of `additionalProperties: false` to enforce strict schema validation. The nested structure follows schema.org conventions.

Consider standardizing `addressCountry` to ISO 3166-1 alpha-2 codes (e.g., "US", "DE", "JP") for consistency and interoperability:

💡 Optional: Constrain addressCountry to ISO 3166-1 alpha-2

```diff
 "addressCountry": {
   "type": "string",
-  "description": "The country"
+  "pattern": "^[A-Z]{2}$",
+  "description": "The country (ISO 3166-1 alpha-2 code)"
 }
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/ontology/schemas/user.json` around lines 37-59: standardize the address country field by constraining "addressCountry" in the "address" object to ISO 3166-1 alpha-2 codes. Update the schema for the "address" object (symbol: "address", property: "addressCountry") to either add an "enum" containing the two-letter country codes or a "pattern" like "^[A-Z]{2}$", and adjust the "description" to mention ISO 3166-1 alpha-2 format to ensure consistent validation and interoperability.
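The prompt offers two options, a `pattern` or an `enum`, and they validate different things. The sketch below illustrates the difference in TypeScript: the pattern only checks the shape of the value, while a set lookup (standing in for an `enum`) checks membership. The `knownCountries` set is a deliberately tiny, hypothetical subset and `validateCountry` is an illustrative helper, not code from the repository.

```typescript
// Pattern from the review suggestion for "addressCountry".
const countryPattern = /^[A-Z]{2}$/;

// Illustrative subset only; a real enum would list every assigned alpha-2 code.
const knownCountries = new Set<string>(["US", "DE", "JP"]);

// Report both checks so the difference between them is visible.
function validateCountry(value: string): { shapeOk: boolean; known: boolean } {
  return {
    shapeOk: countryPattern.test(value),
    known: knownCountries.has(value),
  };
}

const us = validateCountry("US");
const zz = validateCountry("ZZ"); // two capitals, but not in the set
const germany = validateCountry("Germany"); // free text
console.log(us, zz, germany);
```

A pattern is cheaper to maintain; an enum catches well-formed but unassigned codes at the cost of keeping the list current.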
20-63: Downstream systems need updates to utilize new fields.

Based on the codebase, the `UserOntologyData` interface in `platforms/profile-editor/api/src/types/profile.ts` and the `User` entity in `platforms/profile-editor/api/src/database/entities/User.ts` do not include these new fields (`givenName`, `familyName`, `email`, `telephone`, `address`, `inLanguage`).

If the profile-editor needs to store or expose these fields, corresponding updates to the interface and database entity (with migrations) will be required.

Additionally, the ontology service caches schemas at startup, so a service restart is needed after deploying this schema change.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/ontology/schemas/user.json` around lines 20-63: add the new fields from the ontology schema to the profile-editor types and DB entity. Update the UserOntologyData interface (platforms/profile-editor/api/src/types/profile.ts) to include givenName, familyName, email, telephone, address, and inLanguage with appropriate types, then add corresponding columns/properties to the User entity (platforms/profile-editor/api/src/database/entities/User.ts). Decide whether address should be a JSON/embedded object or normalized columns and implement that choice. Create and run a database migration to add any new columns, update any serialization/validation logic that maps between the interface and entity, and restart the ontology/caching service so it loads the updated schema at startup.
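As a rough illustration of the interface update described above, the following TypeScript sketch adds the new fields as optional members. The real `UserOntologyData` interface is not shown in this PR, so `UserOntologyDataSketch`, `PostalAddress`, and `addressToColumn` are hypothetical names, and storing the address in a single JSON column is only one of the two options the prompt mentions.

```typescript
// Assumed shape: mirrors the schema's nested "address" object.
interface PostalAddress {
  streetAddress?: string;
  postalCode?: string;
  addressLocality?: string;
  addressCountry?: string;
}

// Only the new fields are shown; the real interface's existing members are unknown here.
interface UserOntologyDataSketch {
  givenName?: string;
  familyName?: string;
  email?: string;
  telephone?: string;
  address?: PostalAddress;
  inLanguage?: string;
}

// Hypothetical helper: serialize the embedded address for a single JSON column.
function addressToColumn(data: UserOntologyDataSketch): string | null {
  return data.address ? JSON.stringify(data.address) : null;
}

// Sample values are placeholders for illustration.
const example: UserOntologyDataSketch = {
  givenName: "Ada",
  familyName: "Lovelace",
  address: { addressLocality: "London", addressCountry: "GB" },
  inLanguage: "en-GB",
};
const column = addressToColumn(example);
const emptyColumn = addressToColumn({});
console.log(column, emptyColumn);
```

Keeping every new member optional matches a schema in which none of these fields is known to be required; that too is an assumption to verify against `user.json`.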
28-36: PII fields added - ensure downstream handling is compliant.

The `email` field uses `format: email` for validation. Both `email` and `telephone` are PII and will require appropriate handling in consuming services (access controls, encryption at rest, audit logging per GDPR/CCPA if applicable).

Consider adding a loose pattern for `telephone` to reject clearly invalid values while still allowing international formats:

💡 Optional: Add basic telephone pattern

```diff
 "telephone": {
   "type": "string",
+  "pattern": "^\\+?[0-9\\s\\-().]{7,20}$",
   "description": "The user's telephone number"
 },
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/ontology/schemas/user.json` around lines 28-36: mark the PII fields and add a loose international telephone validation. Update the schema entries for "email" and "telephone" so they explicitly flag sensitivity (e.g., add a custom metadata property like "x-pii": true or "sensitive": true on both fields), and add a permissive pattern constraint to "telephone" that allows digits, an optional leading "+", spaces, parentheses and hyphens with a sensible minimum length (e.g., 7 characters) to reject clearly invalid values while permitting international formats. Keep the existing "format": "email" for "email" and add a brief note about downstream handling (access controls/encryption/audit) in the "description" to prompt consumers to treat these fields as sensitive.
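A quick way to judge how loose the suggested telephone pattern is: the TypeScript sketch below runs it against a few made-up inputs. The pattern is the one from the suggestion, written as a regex literal (the JSON string form doubles each backslash); the sample numbers are placeholders, not real data.

```typescript
// Pattern from the review suggestion, as a regex literal.
const telephonePattern = /^\+?[0-9\s\-().]{7,20}$/;

// Placeholder inputs chosen to probe the pattern's boundaries.
const telephoneSamples: string[] = [
  "+1 (555) 123-4567", // international style with punctuation
  "030 1234567",       // national style with a space
  "12345",             // too short
  "call me",           // letters are not allowed
  "-------",           // punctuation only, yet still matches
];

const telephoneResult: Record<string, boolean> = {};
for (const sample of telephoneSamples) {
  telephoneResult[sample] = telephonePattern.test(sample);
}
console.log(telephoneResult);
```

The last sample shows the trade-off: the pattern filters obvious junk but does not require any digits, so it should be treated as a sanity check rather than real phone-number validation.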
```json
"address": {
  "type": "object",
  "description": "The user's postal address",
  "properties": {
    "streetAddress": {
      "type": "string",
      "description": "The street address"
    },
    "postalCode": {
      "type": "string",
      "description": "The postal code"
    },
    "addressLocality": {
      "type": "string",
      "description": "The locality or city"
    },
    "addressCountry": {
      "type": "string",
      "description": "The country"
    }
  },
  "additionalProperties": false
},
```
Probably a good idea to make this a separate ontology called address (a person may have a work address and a home address, or own many homes) and make it a 1:n relation from the user's perspective.
We’ve reviewed the current model and believe that, within eVault, moving toward treating address as a first-class ontological entity is the right direction. In a proper, fully developed architecture, an address should not be just a string inside a user object — it should be a standalone entity with its own lifecycle, reusable across contexts, and capable of participating in relationships. This naturally leads to a many-to-many model: a user can have multiple addresses, and the same address can belong to multiple users. Moreover, for this to work correctly and avoid inconsistencies, it ideally requires a centralized source of truth — for example, a government-backed address registry with stable unique identifiers.
At the same time, we believe that implementing this model at the MVP stage introduces significant complexity. First, there is the problem of mapping between local and global identifiers: without a unified registry, there is no reliable way to match addresses, which quickly leads to duplication and inconsistency. Second, synchronization across platforms becomes non-trivial, as different systems may support different models (some allow only a single address, others multiple), and it becomes unclear how to interpret updates — whether a change represents an update to an existing address or the creation of a new one. Third, the many-to-many model itself requires careful handling of relationships: when multiple users are linked to the same address, or a user has multiple addresses, any change can affect multiple entities and their connections. Fourth, without a normalized source of addresses, duplicate records are inevitable — the same physical address may appear multiple times as separate entities. Additionally, update semantics become ambiguous: when a user enters a “new” address, it is unclear whether it should replace an existing one, be added as a new entry, or be matched against an existing record. Altogether, this makes the system significantly more complex and fragile, especially at the prototype stage.
For these reasons, while we fully support the long-term direction of modeling address as a separate entity with many-to-many relationships and integration with a centralized registry, we believe this is too heavy for the current MVP scope. As a pragmatic step, we propose simplifying the model for now and using a single address embedded directly within the user entity, without introducing a separate ontological entity or supporting multiple addresses. This approach avoids the mapping, duplication, and synchronization issues described above, simplifies integrations, and allows us to move faster toward a working MVP. At the same time, we see this as a temporary simplification, and we can later evolve the model toward a fully normalized solution with standalone address entities, many-to-many relationships, and integration with an external address registry.
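To make the two models in this thread concrete, here is a minimal TypeScript sketch, assuming hypothetical type names throughout (`UserEmbedded`, `AddressEntity`, `UserAddressLink`, `splitEmbedded`). It contrasts the embedded single address proposed for the MVP with standalone address entities joined to users through a link record, and shows one possible way an embedded address could later be lifted into the normalized form. It deliberately ignores the deduplication and registry-matching problems described above.

```typescript
// Address fields as defined in the current schema.
interface Address {
  streetAddress?: string;
  postalCode?: string;
  addressLocality?: string;
  addressCountry?: string;
}

// MVP proposal from the thread: one address embedded in the user.
interface UserEmbedded {
  id: string;
  address?: Address;
}

// Longer-term direction: standalone address entities linked to users.
interface AddressEntity extends Address {
  id: string;
}
interface UserAddressLink {
  userId: string;
  addressId: string;
}

// Hypothetical migration step: lift an embedded address into an entity plus a link.
function splitEmbedded(
  user: UserEmbedded,
  newAddressId: string,
): { address: AddressEntity; link: UserAddressLink } | null {
  if (!user.address) return null;
  return {
    address: { id: newAddressId, ...user.address },
    link: { userId: user.id, addressId: newAddressId },
  };
}

const migrated = splitEmbedded(
  { id: "user-1", address: { addressLocality: "Berlin", addressCountry: "DE" } },
  "addr-1",
);
const nothingToMigrate = splitEmbedded({ id: "user-2" }, "addr-2");
console.log(migrated, nothingToMigrate);
```

Because the link table carries only identifiers, the same `AddressEntity` can be attached to several users, which is the many-to-many relationship the thread describes as the eventual target.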
Description of change
Extending user ontology